Inducing structure in reward learning by learning features

Abstract

Reward learning enables robots to learn adaptable behaviors from human input. Traditional methods model the reward as a linear function of hand-crafted features, but that requires specifying all the relevant features a priori, which is impossible for real-world tasks. To get around this issue, recent deep Inverse Reinforcement Learning (IRL) methods learn rewards directly from the raw state, but this is challenging because the robot has to implicitly learn the features that are important and how to combine them, simultaneously. Instead, we propose a divide-and-conquer approach: focus human input specifically on learning the features separately, and only then learn how to combine them into a reward. We introduce a novel type of human input for teaching features and an algorithm that utilizes it to learn complex features from the raw state space. The robot can then learn how to combine them into a reward using demonstrations, corrections, or other reward learning frameworks. We demonstrate our method in settings where all features have to be learned from scratch, as well as where some of the features are known. By first focusing human input specifically on the feature(s), our method decreases sample complexity and improves generalization of the learned reward over a deep IRL baseline. We show this in experiments with a physical 7-DoF robot manipulator, and in a user study conducted in a simulated environment.
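
To make the "linear function of hand-crafted features" baseline mentioned in the abstract concrete, here is a minimal Python sketch of a trajectory reward computed as a weighted sum of feature counts. The feature functions (`height_above_table`, `distance_to_laptop`), the state layout, the laptop position, and the weights are all hypothetical illustrations, not taken from the paper.

```python
import math

# Assumption: a state is the (x, y, z) position of the robot end effector.
LAPTOP_XY = (0.5, 0.0)  # hypothetical laptop location on the table


def height_above_table(state):
    """Hand-crafted feature: vertical distance from the table (z = 0)."""
    return state[2]


def distance_to_laptop(state):
    """Hand-crafted feature: planar distance to the assumed laptop position."""
    return math.hypot(state[0] - LAPTOP_XY[0], state[1] - LAPTOP_XY[1])


FEATURES = [height_above_table, distance_to_laptop]


def feature_counts(trajectory):
    """Sum each feature over all states of the trajectory."""
    return [sum(f(s) for s in trajectory) for f in FEATURES]


def linear_reward(trajectory, weights):
    """Reward as a weighted sum of trajectory feature counts."""
    phi = feature_counts(trajectory)
    return sum(w * p for w, p in zip(weights, phi))


# Illustrative values only: penalize height, reward distance from the laptop.
trajectory = [(0.0, 0.0, 0.2), (0.5, 0.0, 0.1)]
weights = [-1.0, 2.0]
reward = linear_reward(trajectory, weights)
```

In this formulation every relevant feature must be written by hand before any weights can be learned, which is the limitation the abstract points to.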

Similar resources

Learning Higher-Order Graph Structure with Features by Structure Penalty

In discrete undirected graphical models, the conditional independence of node labels Y is specified by the graph structure. We study the case where there is another input random vector X (e.g. observed features) such that the distribution P(Y | X) is determined by functions of X that characterize the (higher-order) interactions among the Y's. The main contribution of this paper is to learn th...

Inducing Effective Pedagogical Strategies Using Learning Context Features

Effective pedagogical strategies are important for e-learning environments. While it is assumed that an effective learning environment should craft and adapt its actions to the user’s needs, it is often not clear how to do so. In this paper, we used a Natural Language Tutoring System named Cordillera and applied Reinforcement Learning (RL) to induce pedagogical strategies directly from pre-exis...

Reinforcement Learning by Comparing Immediate Reward

This paper introduces an approach to reinforcement learning that compares immediate rewards using a variation of the Q-Learning algorithm. Unlike conventional Q-Learning, the proposed algorithm compares the current reward with the immediate reward of the past move and acts accordingly. Relative-reward-based Q-Learning is an approach towards interactive learning. Q-Learning is a model free re...

Learning reward expectations in honeybees.

The aim of this study was to test whether honeybees develop reward expectations. In our experiment, bees first learned to associate colors with a sugar reward in a setting closely resembling a natural foraging situation. We then evaluated whether and how the sequence of the animals' experiences with different reward magnitudes changed their later behavior in the absence of reinforcement and wit...

Active Reward Learning

While reward functions are an essential component of many robot learning methods, defining such functions remains a hard problem in many practical applications. For tasks such as grasping, there are no reliable success measures available. Defining reward functions by hand requires extensive task knowledge and often leads to undesired emergent behavior. Instead, we propose to learn the reward fu...

Journal

Journal title: The International Journal of Robotics Research

Year: 2022

ISSN: 1741-3176, 0278-3649

DOI: https://doi.org/10.1177/02783649221078031